ABSTRACT

The visible claims of the current accountability 
movement are examined critically, and an alternative philosophy and a 
developing system are offered. Areas examined include the 
psychological implications of accountability philosophies for 
teaching staff, certain educational measurement problems, and the 
availability or adequacy of operating systems required to support 
broad applications of accountability. The alternative philosophy 
offered focuses on operations for which teachers may legitimately be 
held responsible, as opposed to the current movement's focus on 
student outcomes. The comprehensive supporting information system is 
presented in broad outlines, highlighted by a presentation of an 
evaluative component called Comprehensive Achievement Monitoring. 
(Author/LH) 
Schutz (1971), somewhat facetiously, described the roles of measurement in
education as including "... servant, soulmate, stool pigeon, statesman 
[and] scapegoat ..." Each measurement era, it seems, emphasizes one or 
more of these roles to the relative exclusion of the others. Our observations
of the current measurement era suggest that actors in the
roles of stool pigeon and scapegoat are receiving the best reviews, 
while those in the more constructive roles of servant and soulmate are 
waiting in the wings. Sadly enough, this wait may be extended as the
schools become the laboratory for a movement which again places the cart 
before the horse. We refer, of course, to the contemporary movement to- 
ward accountability whose unconstructive connotations may be partially 
communicated by its synonyms: reckoning, tidings, charge, score, inventory,
report, and consequence. Judging from our ongoing efforts to establish
performance-based management information systems in the schools,
it is the unconstructive connotations that teachers and others now asso- 
ciate with accountability in education. 

Advocates of educational accountability, on the other hand, suggest 
that the movement will systematically aid educators to improve student 
achievement and effect needed changes in the schools (Lessinger, 1970).
What can hardly yet be called a system appears to have as its prime elements:
(a) evaluation procedures intended to measure growth in relation
to desired outcomes of instruction, usually involving the use of standardized
tests; (b) some form of public reporting of the extent to which
desired outcomes were achieved; and (c) rewards to persons involved in 
the educational process, contingent upon the extent to which desired 
outcomes are achieved. These elements may be augmented by additions to
the school program to be subjected to test, and internal or external contractual
agreements for program evaluation and/or the management of instruction.

Though recognizing the constructive spirit of accountability, we 
would nevertheless argue that current translations of the concept into 
operations (e.g., Dyer, 1970) and rational constructions (e.g., Barro,
1970) fail to take into account major contingent problems, which may prevent
the movement from achieving any practical results in the contemporary
scene. A first set of problems relates to the recognition of the psycho- 
logical implications of the language and nature of the accountability 
movement for teachers and other professional staff. Evaluation procedures, 
for example, are explicitly external (Martin and Blaschke, 1971;
Lessinger, 1970); the criteria for evaluation are further likely to be
selected or designed by staff external to the school (Lessinger, 1970),
and there is clear intent to report discrepancy data to such audiences as
parents and boards of education. Even the densest clinician would recognize
in this set of events a clear-cut basis for anxiety induction in the
educational community, and would expect the usual chain of psychological 
reactions, ranging from expressed hostility to organized defensive responses 
designed to remove or afford escape from the offending stimulus. More 
extreme end results may include a temporary increase in the local base 
rate of paranoia, where accountability applications turn the educational 
and surrounding community into camps of accounters and accountants. 

A second set of problems arises when the discrepant data are reported
to the public, because the accountability philosophy may easily erode any
remaining local confidence in the schools. This is particularly likely
where an irresponsible press interprets test data only dimly understood
by the public in the first place, and where there are organized community 
groups whose fixed orientation is to negatively reinforce the educational 
staff. 

A third set of problems for the accountability philosophy relates to
the professional implications resulting from externally imposed additions 
or changes to instructional programs. If the negative stimulus value of 
public disclosure of student performance data and "objective" staff evaluation
is not sufficient to motivate instructional staff, then surely they
will react defensively to outsiders assuming instructional and program 
responsibilities. On this matter, one may seriously question the general- 
izability of the notion that any external agencies can reliably produce 
meaningfully greater levels of achievement among children, given the results
of large numbers of special efforts to date (Jenson, 1969; O'Reilly,
1970) and the quality of instructional programs and materials available 
to agents of accountability. 

A fourth set of problems relates to the outcome measures proposed for 
use in accountability applications, typically standardized achievement 
tests. Timid criticisms suggest such tests are not entirely adequate as 
outcome measures, and must be supplemented by tests developed to measure 
other important outcomes (Lessinger, 1970). Other critics flatly conclude 
that standardized testing is irrelevant to the major questions asked in 
the accountability context, including those questions asked about 
achievement of basic skills (Skager, 1971). On a yet more basic plane,
one may convincingly argue that valid definitions of outcomes of school 
performance are not yet generally available and that, therefore, no set 
of adequate outcome measures exists. This last argument is convincingly 
developed by Rohwer (1971) in a discussion of contemporary procedures 
used to generate the objectives of formal schooling. 

A fifth and final problem area relates to current capabilities to 
develop and operationalize relevant information systems and necessary data 
logistics and analytical systems required to answer primary questions of 
the accountability movement. Potentially adequate information systems 
are only now being tried on a small scale (Barro, 1970; Schutz, 1971).
Experience further shows the operations of such systems are exceedingly 
complex, and greatly dependent on the active support of teachers and other 
staff. Clearly, considerable research, development, and installation ac- 
tivity is yet required before generally applicable systems are available 
in support of accountability. Such information system applications are 
also likely to be short-lived or superficial if the data generated do not
relate functionally to the short- and long-term information needs of such 
audiences as teachers, supporting staff, and students. 

The foregoing discussion presents a few supporting arguments for the 
simple proposition that engineering in education should be accomplished 
in full recognition of the nature and condition of the patient. The fla- 
vor of the accountability movement poses a new and seemingly powerful 
threat to the educational system. It is further a pretentious movement 
which assumes that precise causal effects can now be clearly demonstrated 
in education, that these effects adequately define current values and
needs, and that, therefore, related rewards and punishments can be estab- 
lished. Carried to extremes, the movement may be utterly demoralizing to 
those who assume close responsibility for the educational process by mak- 
ing questionable inroads into the functions that school personnel now 
regard as theirs (cf. Mecklenburger and Wilson, 1971, 1971a). The ideals
of responsiveness, relevance, and responsibility in education may, how- 
ever, yet be advanced given appropriate recognition of the organismic 
character of the system. The remainder of the discussion will focus in 
part on the philosophical basis of a system which embodies the inherently
constructive spirit of accountability, but avoids many of the problems 
apparent in publicized notions which now communicate hostile intent to the 
teacher in the field. A major part of the discussion is given over to a 
description of the nature of the system, its operation, and the assistance
it affords the educational process. 

Alternatives to Accountability* 

Language of the System 

In the system to be described, the usual language of accountability 
is eschewed in favor of a more palatable notion that assumes responsibi- 
lity of teachers and others for acting upon information made available on 
the performance of certain job-related operations. Working with teachers, 
curriculum coordinators, and school administrators, we first identify 
educational decision-making domains for which supporting information can 
be made available through computer-based systems. Sample decision-making
domains and related responsible agents are summarized in Table 1.



Insert Table 1 about here



The first decision-making domain in Table 1 relates to the problems
of defining and validating the behavioral outcomes of education, other- 
wise known as curricula. Currently, we show teachers how to proceed from 
a formal analysis of local needs through the establishment of the four 
levels of objectives defined in Table 2. All levels of objectives shown 
are behavioral, and constitute, respectively, the standards of operation 



Insert Table 2 about here 



for a program, a level within a program (both approximations and course 
objectives), and the series of behaviors which lead to the performance of
complex operations, such as those defined by course objectives. For the 
past year, we have been engaged in the empirical derivation of a set of 
terminal objectives in reading, based on analyses of roles and reading 
performance requirements in actual life situations. Alternatively, for 
the other areas of the curriculum, teachers involved in our efforts attempt 
to establish their own terminal objectives on the basis of what they and 
others know of actual life requirements. 

In the domain of curriculum development, the appropriate agents assume 
the responsibility of first defining tentative curriculum standards in the
form of the levels of objectives established for the system. From there,
information systems based on criterion-referenced testing (CRT) (see
Gorth, Schriber, and O'Reilly, 1971) deliver data relevant to the task of
constructing adequate curricula. The responsible agents then examine data
in relation to such curriculum issues as determining the relevance of
objectives, establishing more appropriate sequences of objectives, and 
defining realistic levels of performance. 

In the second domain in Table 1, the responsible agents initially
assume the tasks involved in the design or selection of the materials and
procedures which constitute a program of instruction. Appropriate CRT-based
systems then deliver information at the course objective level and
above. Actions are then taken to continually adjust methods and materials 
to the point where the standards specified by course and terminal objectives 
are approximated by group performance data. Decisions in this area involve 
the replacement of materials and approaches, the appropriate placement of 
review, the addition of needed branches to the program, and so on. 

In the third domain in Table 1, responsibility is assumed for the 
daily management of the instructional process, with the goal being optimization
of learning environments for individual students. Teacher decisions
relate to such functions as: (a) placing the student within the
curriculum at a given level; (b) assigning appropriate instruction;
(c) tracking performance signals as the student proceeds through a given
package of instruction; and (d) assigning appropriate forms of remediation,
selecting enrichment activities, or selecting the next set of
objectives.
Moving finally to the domain of comparative product evaluation, we
face a set of decision alternatives which ultimately must result in the 
selection of the relatively better program. Though we have not generally 
worked at this level, we would presume that program managers at different 
levels would essentially wish to select the best program on the basis of
performance and costs. Included in this area is another set of decision 
alternatives resulting from an evaluative stance regarding schools or districts
as objects to be subjected to comparative evaluation. Given adequate
sets of terminal performance objectives and related measurement
procedures, this may prove a useful stance upon which to base the rational
allocation of resources, such as those available for special programs. 

SPPED: A Computer Managed Information and Resource System*

The final phase of this discussion will consider the measurement, 
evaluation, and resource systems used to support the decision-making 
domains given in Table 1. The broad outlines of a project called SPPED 
(System for Pupil and Program Evaluation and Development) will first be 
presented, followed by a brief discussion of the characteristics of a 
major evaluative component of the system known as CAM (Comprehensive
Achievement Monitoring). 



*The SPPED system is under development by the New York State Education
Department, in cooperation with staff from the University of Massachusetts
(Dr. William Gorth), Stanford University (Dr. Paul Pinsky) and
State University of New York at Stony Brook (Dr. Shelley Harrison).
The SPPED Project

The basic structure of the SPPED project is shown in Figure 1 as a 
set of five interrelated parts, each with a computerized support compo- 
nent. 



Place Figure 1 about here 



The first component of the SPPED is the central BOIR (Bank of 
Objectives, Test Items, and Instructional Resources), which contains 
extensively classified material used for: (a) curriculum development;
(b) test item access; (c) derivation of local paper-based or computerized
BOIRs; and (d) program design and installation. The complexity
of the BOIR may be judged to some extent from the set of reading objec- 
tives now being stored. Approximately 2500 basic or generic objectives 
are being filed for computer access, along with 1000 course objectives 
and 800 terminal and approximate objectives. Each set of objectives is 
classified along as many as 50 different dimensions, each with one or 
several levels. In addition, the "O-bank" is to include several lists
of elements which will enable the user to fill out each objective to 
the desired level of specificity (e.g., vocabulary lists, word ele- 
ments, phonetic elements, etc.). 

The primary intent of the "O-bank" is to automate the task of
curriculum development through the process of computerized selection 
and modification. Entering the system, the user specifies a number of 
classifiers for content, level, and other identifiers. He then receives 
a set of general or generic objectives to which he adds or specifies lists
of elements (e.g., a word list), and indicates the number of levels into
which objectives are to be grouped (e.g., grade levels, modules). The 
resultant output is a structured curriculum with objectives and associ- 
ated elements grouped into the desired number of levels. Objectives 
are further organized into subgroups within levels; for each subgroup, 
output includes one or more criterion objectives (course or summary
objectives). Once this selection process is complete, the local 
reading curriculum is deliverable on hard copy, IBM cards, or magnetic 
tape. A computer program is also available to facilitate the processes 
of adding to the local bank, and editing and deleting material. 
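
The selection-and-grouping step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SPPED program itself: the `Objective` record, the `difficulty` ordering key, the classifier names, and the sample reading objectives are all hypothetical, and objectives are simply split into near-equal groups by that assumed ordering.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Objective:
    """One generic objective with its classifier values (all hypothetical)."""
    code: str
    text: str
    classifiers: dict          # e.g. {"content": "phonics"}
    difficulty: int            # assumed ordering key used for grouping
    elements: list = field(default_factory=list)


def select_objectives(bank, **wanted):
    """Return the objectives whose classifiers match every requested value."""
    return [o for o in bank
            if all(o.classifiers.get(k) == v for k, v in wanted.items())]


def build_curriculum(selected, n_levels, elements=None):
    """Attach user-supplied element lists and split objectives into n_levels groups."""
    elements = elements or {}
    ordered = sorted(selected, key=lambda o: o.difficulty)
    levels = defaultdict(list)
    for rank, obj in enumerate(ordered):
        obj.elements = list(elements.get(obj.code, []))
        level = rank * n_levels // len(ordered) + 1   # near-equal groups, 1-based
        levels[level].append(obj)
    return dict(levels)


bank = [
    Objective("R1", "Identify initial consonants", {"content": "phonics"}, 1),
    Objective("R2", "Blend consonant-vowel pairs", {"content": "phonics"}, 2),
    Objective("R3", "Read sight words aloud", {"content": "vocabulary"}, 1),
    Objective("R4", "Decode two-syllable words", {"content": "phonics"}, 3),
    Objective("R5", "Apply vowel digraph rules", {"content": "phonics"}, 4),
]

chosen = select_objectives(bank, content="phonics")
curriculum = build_curriculum(chosen, n_levels=2, elements={"R1": ["b", "d", "m"]})
for level, objs in sorted(curriculum.items()):
    print(level, [o.code for o in objs])
```

In this sketch the user's classifiers narrow the bank, the element lists fill out individual objectives, and the requested number of levels determines the grouping; the subgroup and criterion-objective output described in the text is not modeled.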

The remaining components of the BOIR serve to index items and 
specific instructional resources in relation to each objective. The 
utility programs for each banking function allow continual update
and refinement. Two secondary programs, Test Construction (TC) and
Test Scheduling (TS), operate on the contents of the I-Bank to generate,
respectively, random or specified test forms for CAM and
mastery testing (MAST-T), and printed test schedules employing
random, stratified sampling for CAM testing. The TS and TC programs
are not restricted to use of the BOIR as a data base. 
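
A minimal sketch of what the two secondary programs do is given below, assuming an item bank keyed by objective and students grouped into strata such as classes. The function names, the data layout, and the form-rotation rule are assumptions made for illustration; the actual TC and TS programs are not documented here.

```python
import random


def build_test_forms(item_bank, n_forms, items_per_objective=1, seed=0):
    """Draw n_forms randomly parallel forms; each form samples items for every objective."""
    rng = random.Random(seed)
    forms = []
    for _ in range(n_forms):
        form = []
        for objective in sorted(item_bank):
            form.extend(rng.sample(item_bank[objective], items_per_objective))
        forms.append(form)
    return forms


def schedule_tests(students_by_stratum, n_forms, n_occasions, seed=0):
    """Assign a form to every student on every occasion, randomizing within each stratum."""
    rng = random.Random(seed)
    schedule = {}
    for stratum, students in sorted(students_by_stratum.items()):
        order = list(students)
        rng.shuffle(order)
        for position, student in enumerate(order):
            # Rotate forms so each stratum is spread evenly over the forms
            # and a student meets a different form on successive occasions.
            schedule[student] = [(position + occasion) % n_forms
                                 for occasion in range(n_occasions)]
    return schedule


# Hypothetical item bank: objective -> item identifiers.
item_bank = {"obj1": ["i1", "i2", "i3"], "obj2": ["i4", "i5", "i6"]}
strata = {"class_a": ["ann", "bob", "cal"], "class_b": ["dee", "eve", "fay"]}

forms = build_test_forms(item_bank, n_forms=3)
schedule = schedule_tests(strata, n_forms=3, n_occasions=3)
```

Because every form samples each objective, any single administration yields group data on the whole curriculum, which is the property the longitudinal CAM design relies on.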

The components of the SPPED described to this point serve school 
personnel as basic resources for such functions as curriculum develop- 
ment, test development, instructional materials selection, and test 
scheduling. The associated computer programs are designed to eliminate
much of the massive paper handling and other clerical work that is necessarily
part of any sizeable systems approach to instruction. The BOIR,
TC, and TS programs were designed after four years of developmental work 
with CAM, and were meant to speed the process of CAM implementation. 

The reason for the additional SPPED resources becomes more apparent 
in the next section which briefly presents the specifications for CAM 
testing. 

Comprehensive Achievement Monitoring (CAM) 

A detailed technical description of CAM and related implementation 
procedures is presented in Gorth, Schriber and O'Reilly (1970). All
decisions are made on the basis of criterion-referenced test
results. The CAM design includes the following components:

1. The definition of a curriculum with behavioral
objectives;

2. The writing of criterion-referenced test items to measure
student performance on each objective;

3. The organization of a set of randomly parallel 
tests, where each test is made up of all or a 
sample of items measuring all the objectives in 
the curriculum and therefore represents item 
sampling; 

4. The design of a longitudinal schedule of test
occasions throughout the course, usually every
three or four weeks;

5. The analysis of the test data and the reporting 
of results by computer, usually within a couple 
of days; 
6. The interpretation of the results by evaluators,
teachers and students as a means for making better
decisions about their instruction and curriculum;
and

7. The modification of curriculum, instructional
activities and the CAM design based upon the
results.

The CAM methodology has been designed to work well with any grade 
level or curricular area. In fact, it has already been used successfully 
in more than 20 schools, with more than 15,000 participating students, and 
at grade levels from 3rd to 12th and in every academic subject area (Allen 
and Gorth, 1971).

Particularly important to the success of this evaluation technique 
is the use of the computer. It alleviates the bottlenecks frequently
encountered in most evaluations: the analysis of data and the reporting
of results. The CAM computer program allows extensive freedom in the
design of evaluations, incorporating both longitudinal testing with 
item sampling and mastery testing to correspond with traditional or 
unusual course designs. 

The information which is typically provided in the CAM system 
includes:

1. For individual students 

a. the total score on the current test 
and all previous tests, and 

b. information on the correctness of 
their response to each item corre- 
sponding to course objectives on 
the current test; and 
2. For any subgroup of students and any set of
questions after each test administration 

a. the achievement level on each 
objective, and 

b. achievement profiles which dis- 
play graphically the level of 
achievement on all objectives 
on the previous test occasions. 

The latter information is identifiable as pretest, posttest, and long- 
term retention data for the duration of the course. 

The computer program allows students' achievement to be plotted on
any given objective (or group of objectives) for the entire course. This
plot, called an achievement profile, gives a graphic presentation of the
changes in group achievement throughout the course. Achievement profiles 
are a unique type of information available from the CAM model. 
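
As a rough illustration, the numbers behind an achievement profile can be computed by tallying, for each objective and test occasion, the percentage of items the group answered correctly. The record layout of (occasion, objective, correct) tuples and the function name below are assumptions; the CAM program's own data format is not described here.

```python
from collections import defaultdict


def achievement_profiles(responses):
    """Return {objective: {occasion: percent correct}} for a group of students.

    `responses` is an iterable of (occasion, objective, is_correct) tuples,
    one per item answered; this record layout is an assumption.
    """
    tally = defaultdict(lambda: [0, 0])   # (objective, occasion) -> [correct, total]
    for occasion, objective, is_correct in responses:
        cell = tally[(objective, occasion)]
        cell[0] += int(is_correct)
        cell[1] += 1
    profiles = defaultdict(dict)
    for (objective, occasion), (correct, total) in sorted(tally.items()):
        profiles[objective][occasion] = 100.0 * correct / total
    return dict(profiles)


# Hypothetical responses pooled over a group of students.
responses = [
    (0, "obj1", False), (0, "obj1", False), (0, "obj1", True), (0, "obj1", False),
    (1, "obj1", True), (1, "obj1", True), (1, "obj1", True), (1, "obj1", False),
    (0, "obj2", True), (0, "obj2", True),
]
profiles = achievement_profiles(responses)
```

Plotting each objective's values against test occasion gives the profile; with item sampling, each point pools the different items that different students happened to receive for that objective.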

Figure 2 presents hypothetical achievement profiles for five objec- 
tives from a course. Brief comments below the graph give possible inter- 
pretations. It is obvious that achievement profiles provide a wealth of 
information, at whatever point in the course they are calculated. On 



Insert Figure 2 about here 



the pretest in the foregoing example, all objectives except number 2 show 
achievement at the chance level, or about 20 percent (five-option multiple- 
choice items). Several decisions could have been made after test administration
1: (a) objective 1 was not learned, so reteach it in some other
way; and (b) objective 2 has tested high on both the pretest and test
administration 1, suggesting it would be safe to skip instruction in
this objective. After test administration 5, two other decisions
might have been made: (a) achievement on objective 3 seems to be
slipping and review is needed; and (b) objective 8 seems closely
related to objective 5 and perhaps should be taught now instead of 
later. CAM, therefore, represents an application of criterion- 
referenced testing to program evaluation performed with longitudinal 
evaluation using item sampling. 
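
The kinds of decisions just illustrated can be expressed as simple rules applied to a profile. The sketch below is hypothetical throughout: the function names, the 80 percent mastery level, and the 15-point margin are illustrative assumptions, not values taken from CAM. Only the chance level follows from the text, since blind guessing on five-option items yields 100/5 = 20 percent.

```python
def chance_level(n_options):
    """Expected percent correct from blind guessing on n-option multiple-choice items."""
    return 100.0 / n_options


def flag_objective(profile, taught_before, n_options=5, mastery=80.0, drop=15.0):
    """Suggest one action for an objective from its achievement profile.

    `profile` maps test occasion -> percent correct; `taught_before` is the
    first occasion following instruction. Thresholds are illustrative.
    """
    occasions = sorted(profile)
    pretest = profile[occasions[0]]
    after = [profile[t] for t in occasions if t >= taught_before]
    if pretest >= mastery:
        return "already known: consider skipping instruction"
    if not after:
        return "not yet taught"
    if after[0] < mastery and after[0] - chance_level(n_options) < drop:
        return "not learned: reteach another way"
    if max(after) - after[-1] >= drop:
        return "slipping: schedule review"
    return "on track"


# Hypothetical profiles loosely patterned on the cases discussed above.
not_learned = flag_objective({0: 20, 1: 22, 2: 25}, taught_before=1)
known = flag_objective({0: 85, 1: 88}, taught_before=2)
slipping = flag_objective({0: 20, 3: 85, 4: 70, 5: 60}, taught_before=3)
```

A rule for spotting related objectives, such as achievement on one objective rising when another is taught, would require comparing profiles across objectives and is not attempted here.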

Conclusion

This has been a brief foray into the characteristics of a 
developing system and underlying philosophy designed to enable a 
systems approach to instruction in the local school context. In contrast
with a great many articles on systems and models of accountability,
the efforts described here have been designed and partially or wholly 
implemented in full realization of local needs and local potential for 
change. The simple wisdom in the philosophical basis of our work emanates
from experiences with hundreds of teachers over several years.
Some contrasts between this developing philosophy and the characteristics
of accountability philosophies now prevalent in the literature are sum- 
marized as follows: 

1. The language and focus of the system are on
assuming responsibility for acting on information
to effect changes in curricula,
instruction, and the learning activities of
individuals -- as opposed to a direct focus
on outcomes.

2. Teachers and related staff assist in formu- 
lating the outcomes of schooling in contrast 
with the external imposition of performance 
standards.

3. Teachers and related staff have a direct hand 
in specifying the measurements to be used in 
relation to determining performance, in 
selecting options available from different 
evaluation systems, and even in the formula- 
tion of evaluation systems as they are 
developed.

4. Program and instructional decisions are pri- 
marily the responsibility of regular instruc- 
tional staff; external agents assist and advise 
in this process. 

5. Public reporting of performance data, only now 
being considered, is to be initially based on 
simple (+, -) accounting of skills mastered, 
with the intent of involving parents continuously 
and functionally in the instructional process. 

Comparative performance data on programs, schools,
and individuals are not projected for use on the 
local level. 

Administrators, board members, and others interested in accountability 
applications would do well to give very careful consideration to the various 
interrelated philosophical, psychological, technical, and logistical issues 
involved. Perhaps an appropriate starting point is to generate a local 
philosophy of accountability -- one which would be generally shared by all 
major participant groups. Once the sensitive issues are more or less set- 
tled and organized into a set of acceptable guidelines for development, the 
local leadership will very likely find a more than adequate basis for
proceeding with the business of accountability. 
Table 1

Selected Decision Making Domains for Which Test Data
Are Made Available in CMI Systems

Domain                               Primary Responsible Agent

1. Curriculum Development            Curriculum teams, administrators,
                                     parents

2. Product Refinement                Teachers, teacher teams,
                                     administrators

3. Instructional Management          Teachers, students, parents

4. Comparative Product Evaluation    Administrators, program managers
Table 2

Levels of Objectives as Related to Derivation

Levels of Objectives        Derivation (from)

Terminal Objective          Actual or intended life performance
                            situations

Approximations to           Four successively less difficult
Terminal Objectives         levels derived from terminal
                            objectives

Course Objective            A temporary "stand-in" for
                            approximations; a criterion or
                            course or summative objective

Enabling Objective          One of a series of lower level
                            objectives derived from course
                            objectives or approximations



Figure 1 



Components of the SPPED Project 
Figure 2

Achievement Profiles of a Group on Five Objectives

(Vertical axis: percent of questions correct)

Objective 1: taught, but students did not learn; with rapid
feedback, could be corrected with change in
instruction (taught just before week 1)

Objective 2: previously known and not taught; without pretest,
this looks like student learning (taught
just before week 2)

Objective 3: taught and learned, but forgotten (taught just
before week 3)

Objective 4: well taught (taught just before week 4)

Objective 8: appears related to objective 5, because achievement
increases when 5 is taught (taught just before
week 8)